Evaluation of voice activity detection by combining multiple features with weight adaptation
نویسندگان
چکیده
For noise-robust automatic speech recognition (ASR), we propose a novel voice activity detection (VAD) method based on a combination of multiple features. The scheme uses a weighted combination of four conventional VAD features: amplitude level, zero crossing rate, spectral information, and Gaussian mixture model (GMM) likelihood. The weights for combination are adaptively updated using minimum classification error (MCE) training. In this paper, we first investigate the effect of adaptation of the combination weights and GMM parameters, and demonstrate that the weights can be effectively adapted with a single utterance. Then, we present application of the method to ASR. It is confirmed that the proposed method significantly outperforms conventional methods in various noise conditions.
منابع مشابه
Voice activity detection based on conditional random fields using multiple features
This paper proposes a Voice Activity Detection (VAD) algorithm based on Conditional Random Fields (CRF) using multiple features. VAD is a technique used to distinguish between speech and non-speech in noisy environments and is an important component in many real-world speech applications. The posterior probability of output labels in the proposed method is directly modeled by the weighted sum o...
متن کاملA New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)
Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, much of the non-speech signals maybe classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...
متن کاملEvaluation the Mean Performance and Stability of Rice Genotypes by Combining Features of AMMI and BLUP Techniques and Selection Based on Multiple Traits
Additive main effect and multiplicative interaction (AMMI) and best linear unbiased prediction (BLUP) are two methods for analyzing multi-environment trials (MET). In this study, seven selected rice lines were evaluated along with two check varieties based on randomized complete block design in Tonekabon, Amol and Sari (Iran) in three growing seasons of 2011-14. To quantify the genotypic stabil...
متن کاملAutomated Detection of Multiple Sclerosis Lesions Using Texture-based Features and a Hybrid Classifier
Background: Multiple Sclerosis (MS) is the most frequent non-traumatic neurological disease capable of causing disability in young adults. Detection of MS lesions with magnetic resonance imaging (MRI) is the most common technique. However, manual interpretation of vast amounts of data is often tedious and error-prone. Furthermore, changes in lesions are often subtle and extremely unrepresentati...
متن کاملImproving Deep Neural Network Based Speech Enhancement in Low SNR Environments
We propose a joint framework combining speech enhancement (SE) and voice activity detection (VAD) to increase the speech intelligibility in low signal-noise-ratio (SNR) environments. Deep Neural Networks (DNN) have recently been successfully adopted as a regression model in SE. Nonetheless, the performance in harsh environments is not always satisfactory because the noise energy is often domina...
متن کامل